Supervised Classification with Gaussian Networks. Filter and Wrapper Approaches
Authors
Abstract
Bayesian network based classifiers are only able to handle discrete variables: they assume that variables are sampled from a multinomial distribution, whereas most real-world domains involve continuous variables. A common practice for dealing with continuous variables is to discretize them, with a subsequent loss of information. The continuous classifiers presented in this paper are supported by the Gaussian network paradigm, which assumes that variables follow a Gaussian distribution. A great advantage of Gaussian networks is that they need only O(n^2) parameters to model a complete graph. This work shows how classifiers supported by the Bayesian network paradigm can be adapted to deal with continuous variables without discretizing them. In addition, two novel classifier learning algorithms are introduced. The presented learning algorithms are ordered and grouped according to their structural complexity: from the simplest naive Bayes structures to k-dependence Bayesian classifiers and semi naive Bayes. Moreover, for each structure both a filter and a wrapper approach are presented. All these classifiers are empirically evaluated using the Brier score and the predictive accuracy. The results obtained with both scores suggest that semi naive Bayes is the best classifier.
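To make the simplest structure mentioned in the abstract concrete, here is a minimal sketch of a naive Bayes classifier with Gaussian class-conditional densities, scored with predictive accuracy and a Brier score. It is an illustration under stated assumptions, not the paper's implementation: the function names (`fit_gaussian_nb`, `predict_proba`, `brier_score`, `accuracy`), the toy data, the variance floor, and the multi-class Brier normalization (sum over classes, mean over instances) are all choices made here; the paper may define or normalize the score differently.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate class priors and per-class, per-feature mean and variance."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Maximum-likelihood variance; the small floor (an assumption made
        # here) avoids division by zero for constant features.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-9)
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_proba(model, x):
    """Return the posterior p(c | x) for every class c."""
    log_joint = {}
    for label, (prior, means, variances) in model.items():
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            # Log of a univariate Gaussian density; features are treated as
            # conditionally independent given the class (naive Bayes).
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_joint[label] = ll
    # Normalize with the log-sum-exp trick for numerical stability.
    top = max(log_joint.values())
    total = sum(math.exp(v - top) for v in log_joint.values())
    return {c: math.exp(v - top) / total for c, v in log_joint.items()}


def brier_score(model, X, y):
    """Mean over instances of the squared error between posteriors and one-hot labels."""
    score = 0.0
    for x, label in zip(X, y):
        probs = predict_proba(model, x)
        score += sum(
            (p - (1.0 if c == label else 0.0)) ** 2 for c, p in probs.items()
        )
    return score / len(X)


def accuracy(model, X, y):
    """Fraction of instances whose most probable class matches the label."""
    hits = 0
    for x, label in zip(X, y):
        probs = predict_proba(model, x)
        hits += max(probs, key=probs.get) == label
    return hits / len(X)


# Hypothetical toy data: two continuous features, two well-separated classes.
X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 0.5], [3.2, 0.4], [2.9, 0.7]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_gaussian_nb(X, y)
```

A lower Brier score indicates better calibrated posteriors, while accuracy only checks the most probable class, which is why the two scores can rank classifiers differently. In this sketch, a wrapper approach would search over structures or feature subsets by re-estimating a score such as `accuracy` for each candidate, whereas a filter approach would rank candidates with a measure computed without training the classifier.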
Similar resources
Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine
Different approaches have been proposed for feature selection to obtain a suitable feature subset among all features. These methods search the feature space for feature subsets that satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected due to some measu...
Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection
Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...
Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets
Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...
Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gen Expression
Nowadays, the growing volume of data and number of attributes in datasets has reduced the accuracy of learning algorithms and increased their computational complexity. Feature selection is a dimensionality reduction method that can be carried out through filtering and wrapping. Wrapper methods are more accurate than filter ones, but filter methods run faster and have a lower computational burden. With ...
Features of Gene Extraction by Nonlinear Support Vector Machines in Gene Expression Analysis
Statistical analysis on gene expression data from DNA microarray has enabled us to extract information from tissue and cell samples. Comparing two classes of gene expression datasets (e.g. datasets from normal tissues and cancerous tissues), we first choose discriminative genes, which have significantly different expression values between two classes and characterize each class. In the most cas...